@DNXie commented Aug 28, 2025

Added logging for:

  • loss: total, KL, and policy (averaged per batch)
  • reward (averaged per 10 rollouts)
  • ratio mean and std (averaged per batch)
  • response length (averaged per batch)
  • advantages (averaged per batch)
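
For reviewers, here is a rough sketch of how the metrics listed above could be reduced to scalars before logging. It is not the code in this PR: `batch_metrics`, `RewardAverager`, and the metric key names are hypothetical, the inputs are plain Python lists rather than tensors, and "avg per 10 rollout" is assumed to mean one averaged value emitted after every 10 rollouts.

```python
from statistics import fmean, pstdev


def batch_metrics(total_loss, kl_loss, policy_loss, ratios, response_lengths, advantages):
    """Reduce per-sample values from one batch to one scalar per metric.

    All arguments are non-empty sequences of numbers (hypothetical layout).
    """
    return {
        "loss/total": fmean(total_loss),
        "loss/kl": fmean(kl_loss),
        "loss/policy": fmean(policy_loss),
        "ratio/mean": fmean(ratios),
        "ratio/std": pstdev(ratios),  # population std over the batch
        "response_length": fmean(response_lengths),
        "advantages": fmean(advantages),
    }


class RewardAverager:
    """Collect rollout rewards and emit their mean once every `window` rollouts."""

    def __init__(self, window=10):
        self.window = window
        self.buffer = []

    def add(self, reward):
        """Record one reward; return the window mean when the window fills, else None."""
        self.buffer.append(reward)
        if len(self.buffer) < self.window:
            return None
        avg = fmean(self.buffer)
        self.buffer.clear()
        return avg
```

In this sketch the caller would log the dictionary from `batch_metrics` once per training batch, and log the reward only when `RewardAverager.add` returns a value instead of `None`.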

@meta-cla bot added the CLA Signed label Aug 28, 2025
@DNXie DNXie closed this by deleting the head repository Oct 3, 2025